- 
                Notifications
    You must be signed in to change notification settings 
- Fork 239
feat(compass-collection): Output Validation Followup - Mock Data Generator LLM CLOUDP-347048 #7365
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…p - Mock Data Generator
        
          
                ...mpass-collection/src/components/mock-data-generator-modal/mock-data-generator-modal.spec.tsx
              
                Outdated
          
            Show resolved
            Hide resolved
        
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances the mock data generator modal with comprehensive validation and rendering improvements for faker method arguments. The changes focus on improving argument validation, centralizing validation logic, and enabling UI display of faker arguments.
- Adds comprehensive validation for faker arguments including nested structures, type checking, and size limits
- Refactors validation logic from collection-tab.ts to a shared utility module for better maintainability
- Updates UI components to display faker arguments as read-only text inputs with proper type handling
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description | 
|---|---|
| collection-tab.ts | Refactors validation to use centralized utility function | 
| utils.ts | Introduces comprehensive faker argument and method validation utilities | 
| utils.spec.ts | Adds extensive unit tests for validation functions | 
| script-generation-utils.ts | Extends FakerArg type to support nested arrays | 
| mock-data-generator-modal.spec.tsx | Adds integration test for oversized argument handling | 
| faker-schema-editor-screen.tsx | Passes faker arguments to mapping selector | 
| faker-mapping-selector.tsx | Implements UI rendering for faker arguments | 
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
        
          
                packages/compass-collection/src/components/mock-data-generator-modal/faker-mapping-selector.tsx
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                packages/compass-collection/src/components/mock-data-generator-modal/utils.ts
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                packages/compass-collection/src/components/mock-data-generator-modal/utils.ts
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                packages/compass-collection/src/components/mock-data-generator-modal/utils.ts
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | Assigned  | 
| expect(result.fakerArgs).to.deep.equal([]); | ||
| }); | ||
|  | ||
| it('returns false for valid method with invalid arguments and fallback fails', () => { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be invalid arguments with valid method falls back to true and strips args? That's what's happening I think, which is why this test is failing right now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, forgot to update it.
| expect(result.fakerArgs).to.deep.equal([]); | ||
| }); | ||
|  | ||
| it('returns false for method with invalid fakerArgs', () => { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comment applies here too I believe. Both tests should expect isValid: true and fakerArgs: []
        
          
                packages/compass-collection/src/components/mock-data-generator-modal/utils.ts
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | } else if (typeof arg === 'boolean') { | ||
| // booleans are always valid, continue | ||
| } else if (Array.isArray(arg)) { | ||
| if (!areFakerArgsValid(arg)) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we limit the depth of this to prevent potential infinite recursion? We could pass a depth param and limit to say, 3 levels
| callable(); | ||
| return { isValid: true, fakerArgs: [] }; | ||
| } | ||
| } catch { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A warning when there's an exception or the args are invalid would be valuable for fine tuning the prompt
        
          
                ...mpass-collection/src/components/mock-data-generator-modal/mock-data-generator-modal.spec.tsx
          
            Show resolved
            Hide resolved
        
              
          
                packages/compass-collection/src/components/mock-data-generator-modal/faker-mapping-selector.tsx
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                packages/compass-collection/src/components/mock-data-generator-modal/faker-mapping-selector.tsx
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                packages/compass-collection/src/components/mock-data-generator-modal/faker-mapping-selector.tsx
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | The faker arg form experience itself is interesting, I wonder how well users can intuit that the args are positional. Could be a follow-up. Maybe the user can be shown an inline preview of the faker call (cc @jcobis), right now they'd need to make the connection by reading the Script output | 
| 
 Yeah, good question. If we can't show a specific label for each faker argument, an inline preview of the faker call would be good UI to show. | 
| 
 What if: if there are args, we just display the whole sample function call with args in a readonly text input? This is what I had in mind. Thoughts? @ncarbon @kpamaran | 
| 
 I have confidence the sample call should be displayed. I'm not sure if it should be in an input, because the preview is not part of the form data (non-semantic usage). It could be some inline text above the field set. @jcobis @ncarbon | 
| const MAX_FAKER_ARGS_LENGTH = 10; | ||
| const MAX_FAKER_STRING_LENGTH = 1000; | ||
| const MAX_FAKER_ARGS_DEPTH = 3; | ||
| const MAX_FAKER_NUMBER_SIZE = 1000; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thoughts on these values? @jcobis
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we could bump MAX_FAKER_NUMBER_SIZE to 10000 at least
…gs before call, update ui label
| fontWeight: 600, | ||
| }); | ||
|  | ||
| const parseFakerArg = (arg: FakerArg): string => { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: there's a parseFakerArgs with a different behavior. One should be renamed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't find another function you mentioned, but the name still doesn't make too much sense: it seems to be the opposite of parse, takes the arguments and serializes them to string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renaming to stringifyFakerArg
| fontWeight: 600, | ||
| }); | ||
|  | ||
| const parseFakerArg = (arg: FakerArg): string => { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't find another function you mentioned, but the name still doesn't make too much sense: it seems to be the opposite of parse, takes the arguments and serializes them to string
| if (typeof arg === 'object' && arg !== null && 'json' in arg) { | ||
| try { | ||
| return JSON.stringify(JSON.parse(arg.json)); | ||
| } catch { | ||
| return ''; | ||
| } | ||
| } | ||
| return arg.toString(); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the type definition, doesn't look like the array case is being handled well at all. If arg is FakerArg[] (which is part of the FakerArg type, it recursively references itself), you will just call prototype toString on it. Is this a mistake in how types are defined or in the method logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, going to handle arrays recursively
| function isAllowedHelper(moduleName: string, methodName: string) { | ||
| return ( | ||
| moduleName !== 'helpers' || | ||
| methodName === 'arrayElement' || | ||
| methodName === 'arrayElements' | ||
| ); | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hard to understand the intent here reading this expression, also function name doesn't match what's going on here, we're not validating only the helpers, but all method names. I would suggest to restructure it to be more explicit about what we're actually checking:
| function isAllowedHelper(moduleName: string, methodName: string) { | |
| return ( | |
| moduleName !== 'helpers' || | |
| methodName === 'arrayElement' || | |
| methodName === 'arrayElements' | |
| ); | |
| } | |
| function isAllowedFakerFn(moduleName: string, methodName: string) { | |
| if (moduleName !== 'helpers') { | |
| return true // all non-helper modules are allowed unconditionally | |
| } | |
| return methodName === 'arrayElement' || methodName === 'arrayElements' // only array helpers are allowed | |
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored for readability
| if (typeof arg === 'object' && arg !== null && 'json' in arg) { | ||
| return JSON.parse((arg as { json: string }).json); | ||
| } | ||
| return arg; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FakerArg is recursively an array of the same type and this is not handled. Is this type actually defined correctly? There is a lot of code here now that uses FakerArg[], but FakerArg type already includes an array, so either marking it as an array is redundant, or the type is wrong and it's always supporsed to be a flat array of arguments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The recursive type definition is to support nested arrays.
| import type { FakerArg } from './script-generation-utils'; | ||
| import { faker } from '@faker-js/faker/locale/en'; | ||
|  | ||
| const MAX_FAKER_ARGS_LENGTH = 10; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking through the docs I can't see a case where it's ever more than 1 argument actually, any reason not to limit this more aggresively?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see some examples of 2 arguments, such as fromCharacters and arrayElements, but none more than that. Will reduce this to 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see the issue, this is just an imprecise variable name. Should be MAX_ARRAY_LENGTH, will rename. I can also add validation to ensure top-level args are limited to 2 at most.
| typeof (arg as { json?: unknown }).json === 'string' | ||
| ) { | ||
| try { | ||
| const parsedJson = JSON.parse((arg as { json: string }).json); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use an in check here which will also make it so that typescript know that property exists, so that you don't need to do extra assertions later:
| typeof (arg as { json?: unknown }).json === 'string' | |
| ) { | |
| try { | |
| const parsedJson = JSON.parse((arg as { json: string }).json); | |
| 'json' in arg && typeof arg.json === 'string' | |
| ) { | |
| try { | |
| const parsedJson = JSON.parse(arg.json); | 
| function prepareFakerArgs(args: FakerArg[]) { | ||
| return args.map((arg) => { | ||
| if (typeof arg === 'object' && arg !== null && 'json' in arg) { | ||
| return JSON.parse((arg as { json: string }).json); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redundant assertion
| return JSON.parse((arg as { json: string }).json); | |
| return JSON.parse(arg.json); | 
Description
This pull request introduces more validation and rendering improvements for handling faker method arguments in the mock data generator modal. The main changes include adding comprehensive validation logic for faker arguments and method names, updating UI components to display faker arguments, and refactoring validation logic out of the module to a shared utility.
Validation and Utility Enhancements:
areFakerArgsValidand improvedisValidFakerMethodutility functions inutils.tsto thoroughly validate faker arguments (including nested and object types) and method names, enforcing limits on argument length, string size, and value types.collection-tab.tsto use the newisValidFakerMethodutility, ensuring faker method and argument validation is consistent and centralized. [1] [2] [3]Type and Interface Updates:
FakerArgtype to support nested arrays, enabling more flexible representation of faker arguments.UI and Rendering Improvements:
FakerMappingSelectorto renderTextInputcomponents for each faker argument, supporting various types (string, number, boolean, arrays, and objects) and displaying them in a read-only format. [1] [2] [3]FakerSchemaEditorContentto pass the active faker arguments to the mapping selector for display. [1] [2]Testing:
areFakerArgsValidandisValidFakerMethodinutils.spec.ts, covering edge cases such as deeply nested structures, invalid argument types, and method existence.Checklist
Motivation and Context
Open Questions
Dependents
Types of changes